Fast Computation of Entropic Profiles for the Detection of Conservation in Genomes
نویسندگان
چکیده
The information theory has been used for quite some time in the area of computational biology. In this paper we discuss and improve the function Entropic Profile, introduced by Vinga and Almeida in [23]. The Entropic Profiler is a function of the genomic location that captures the importance of that region with respect to the whole genome. We provide a linear time linear space algorithm called Fast Entropic Profile, as opposed to the original quadratic implementation. Moreover we propose an alternative normalization that can be also efficiently implemented. We show that Fast EP is suitable for large genomes and for the discovery of motifs with unbounded length.
منابع مشابه
gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences
Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...
متن کاملBehavior-Based Online Anomaly Detection for a Nationwide Short Message Service
As fraudsters understand the time window and act fast, real-time fraud management systems becomes necessary in Telecommunication Industry. In this work, by analyzing traces collected from a nationwide cellular network over a period of a month, an online behavior-based anomaly detection system is provided. Over time, users' interactions with the network provides a vast amount of usage data. Thes...
متن کاملProfile of Eight Prophage Sequences Present in the Genomes of Different Acinetobacter baumannii Strains
ABSTRACT Background and Objective: Prophage sequences are major contributors to interstrain variations within the same bacterial species. Acinetobacter baumannii is a gram-negative bacterium that causes a wide range of nosocomial infections, especially in intensive care unit inpatients. Prophage sequences constitute a considerable proporti...
متن کاملNumerical Computation Of Multi-Component Two-Phase Flow in Cathode Of PEM Fuel Cells
A two-dimensional, unsteady, isothermal and two-phase flow of reactant-product mixture in the air-side electrode of proton exchange membrane fuel cells (PEMFC) is studied numerically in the present study. The mixture is composed of oxygen, nitrogen, liquid water and water vapor. The governing equations are two species conservation, a single momentum equation for mobile mixture, liquid mass cons...
متن کاملEntropic profiles of DNA sequences through chaos-game-derived images.
A new method to determine entropic profiles in DNA sequences is presented. It is based on the chaos-game representation (CGR) of gene structure, a technique which produces a fractal-like picture of DNA sequences. First, the CGR image was divided into squares 4-m in size (m being the desired resolution), and the point density counted. Second, appropriate intervals were adjusted, and then a histo...
متن کامل